Robustly Learning a Gaussian: Getting Optimal Error, Efficiently
Authors
Abstract
We study the fundamental problem of learning the parameters of a high-dimensional Gaussian in the presence of noise, where an ε-fraction of our samples were chosen by an adversary. We give robust estimators that achieve estimation error O(ε) in the total variation distance, which is optimal up to a universal constant that is independent of the dimension. In the case where just the mean is unknown, our robustness guarantee is optimal up to a factor of √2 and the running time is polynomial in d and 1/ε. When both the mean and covariance are unknown, the running time is polynomial in d and quasipolynomial in 1/ε. Moreover, all of our algorithms require only a polynomial number of samples. Our work shows that the same sorts of error guarantees that were established over fifty years ago in the one-dimensional setting can also be achieved by efficient algorithms in high-dimensional settings.
Funding: Supported by NSF CAREER Award CCF-1652862, a Sloan Research Fellowship, and a Google Faculty Research Award. Supported by NSF CCF-1551875, CCF-1617730, CCF-1650733, and ONR N00014-12-1-0999. Supported by NSF CAREER Award CCF-1553288 and a Sloan Research Fellowship. Supported by NSF CAREER Award CCF-1453261, CCF-1565235, a Google Faculty Research Award, and an NSF Graduate Research Fellowship. Supported by NSF CAREER Award CCF-1453261, CCF-1565235, a Packard Fellowship, a Sloan Research Fellowship, a grant from the MIT NEC Corporation, and a Google Faculty Research Award. Supported by a USC startup grant.
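The abstract's reference to the one-dimensional setting can be illustrated with a small sketch. This is not the paper's algorithm. It only contrasts the empirical mean with the median on a one-dimensional Gaussian sample in which an ε-fraction of points has been overwritten. The function name `contaminated_sample`, the fixed outlier value, and all parameter choices are assumptions made for illustration, and a real adversary may choose its corruptions far more cleverly than this one.

```python
import random
import statistics


def contaminated_sample(n, eps, mu=0.0, sigma=1.0, outlier=100.0, seed=0):
    """Draw n points from N(mu, sigma^2), then overwrite an eps-fraction.

    Hypothetical helper: the "adversary" here simply replaces the first
    int(eps * n) points with one fixed outlier value.
    """
    rng = random.Random(seed)
    xs = [rng.gauss(mu, sigma) for _ in range(n)]
    for i in range(int(eps * n)):
        xs[i] = outlier
    return xs


true_mean = 0.0
xs = contaminated_sample(n=10_000, eps=0.05, mu=true_mean)

# The empirical mean is dragged toward the outliers by roughly eps * outlier.
mean_err = abs(statistics.fmean(xs) - true_mean)

# The median moves only by an amount on the order of eps * sigma.
median_err = abs(statistics.median(xs) - true_mean)

print(f"mean error:   {mean_err:.3f}")
print(f"median error: {median_err:.3f}")
```

In this toy setting the median's error stays on the order of ε no matter how far away the outliers are placed, while the mean's error grows with the outlier magnitude. The abstract's claim is that comparable O(ε) guarantees are achievable efficiently in high dimensions.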
Related resources
Linear Time Varying MPC Based Path Planning of an Autonomous Vehicle via Convex Optimization
In this paper, a new method is introduced for path planning of an autonomous vehicle. The environment is assumed to be cluttered and subject to several sources of uncertainty, so the state of each detected object must be estimated using an optimal filter. To do so, the state distribution is assumed to be Gaussian, and the state vector is estimated by a Kalman filter at each time step. The estimation...
Gaussian process dynamic programming
Reinforcement learning (RL) and optimal control of systems with continuous states and actions require approximation techniques in most interesting cases. In this article, we introduce Gaussian process dynamic programming (GPDP), an approximate value-function based RL algorithm. We consider both a classic optimal control problem, where problem-specific prior knowledge is available, and a classic...
Robustly representing uncertainty through sampling in deep neural networks
As deep neural networks (DNNs) are applied to increasingly challenging problems, they will need to be able to represent their own uncertainty. Modelling uncertainty is one of the key features of Bayesian methods. Using Bernoulli dropout with sampling at prediction time has recently been proposed as an efficient and well performing variational inference method for DNNs. However, sampling from ot...
Can Gaussian Process Regression Be Made Robust Against Model Mismatch?
Learning curves for Gaussian process (GP) regression can be strongly affected by a mismatch between the ‘student’ model and the ‘teacher’ (true data generation process), exhibiting e.g. multiple overfitting maxima and logarithmically slow learning. I investigate whether GPs can be made robust against such effects by adapting student model hyperparameters to maximize the evidence (data likelihoo...
Generalization Errors and Learning Curves for Regression with Multi-task Gaussian Processes
We provide some insights into how task correlations in multi-task Gaussian process (GP) regression affect the generalization error and the learning curve. We analyze the asymmetric two-tasks case, where a secondary task is to help the learning of a primary task. Within this setting, we give bounds on the generalization error and the learning curve of the primary task. Our approach admits intuit...